Semi-supervised Learning from Only Positive and Unlabeled Data Using Entropy

نویسندگان

  • Xiaoling Wang
  • Zhen Xu
  • Chaofeng Sha
  • Martin Ester
  • Aoying Zhou
چکیده

The problem of classification from positive and unlabeled examples attracts much attention currently. However, when the number of unlabeled negative examples is very small, the effectiveness of former work has been decreased. This paper propose an effective approach to address this problem, and we firstly use entropy to selects the likely positive and negative examples to build a complete training set; and then logistic regression classifier is applied on this new training set for classification. A series of experiments are conducted. The experimental results illustrate that the proposed approach outperforms previous work in the literature.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...

متن کامل

Deep Bayesian Active Semi-Supervised Learning

In many applications the process of generating label information is expensive and time consuming. We present a new method that combines active and semi-supervised deep learning to achieve high generalization performance from a deep convolutional neural network with as few known labels as possible. In a setting where a small amount of labeled data as well as a large amount of unlabeled data is a...

متن کامل

Semi-supervised learning for text classification using feature affinity regularization

Most conventional semi-supervised learning methods attempt to directly include unlabeled data into training objectives. This paper presents an alternative approach that learns feature affinity information from unlabeled data, which is incorporated into the training objective as regularization of a maximum entropy model. The regularization favors models for which correlated features have similar...

متن کامل

Iterative Hybrid Algorithm for Semi-supervised Classification

In the typical supervised learning scenario we are given a set of labeled examples and we aim to induce a model that captures the regularity between the input and the class. However, most of the classification algorithms require hundreds or even thousands of labeled examples to achieve satisfactory performance. Data labels come at high costs as they require expert knowledge, while unlabeled dat...

متن کامل

Query Selection via Weighted Entropy in Graph-Based Semi-supervised Classification

There has recently been a large effort in using unlabeled data in conjunction with labeled data in machine learning. Semi-supervised learning and active learning are two well-known techniques that exploit the unlabeled data in the learning process. In this work, the active learning is used to query a label for an unlabeled data on top of a semisupervised classifier. This work focuses on the que...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010